Overview

Dataset statistics

Number of variables: 16
Number of observations: 18671
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 MiB
Average record size in memory: 104.0 B

Variable types

Numeric: 9
Categorical: 7

Warnings

survey_id has constant value "1476" Constant
city has constant value "Amsterdam" Constant
name has a high cardinality: 18150 distinct values High cardinality
last_modified has a high cardinality: 18671 distinct values High cardinality
location has a high cardinality: 18671 distinct values High cardinality
room_id is highly correlated with host_id High correlation
host_id is highly correlated with room_id High correlation
accommodates is highly correlated with bedrooms and 1 other field High correlation
bedrooms is highly correlated with accommodates High correlation
price is highly correlated with accommodates High correlation
room_id is highly correlated with reviews High correlation
reviews is highly correlated with room_id and 1 other field High correlation
overall_satisfaction is highly correlated with reviews High correlation
accommodates is highly correlated with bedrooms and 1 other field High correlation
bedrooms is highly correlated with accommodates and 1 other field High correlation
price is highly correlated with accommodates and 1 other field High correlation
reviews is highly correlated with overall_satisfaction High correlation
overall_satisfaction is highly correlated with reviews High correlation
accommodates is highly correlated with bedrooms High correlation
bedrooms is highly correlated with accommodates High correlation
accommodates is highly correlated with bedrooms High correlation
latitude is highly correlated with longitude and 1 other field High correlation
longitude is highly correlated with latitude and 1 other field High correlation
host_id is highly correlated with room_id High correlation
neighborhood is highly correlated with latitude and 1 other field High correlation
bedrooms is highly correlated with accommodates High correlation
room_id is highly correlated with host_id High correlation
survey_id is highly correlated with city and 2 other fields High correlation
city is highly correlated with survey_id and 2 other fields High correlation
neighborhood is highly correlated with survey_id and 1 other field High correlation
room_type is highly correlated with survey_id and 1 other field High correlation
name is uniformly distributed Uniform
last_modified is uniformly distributed Uniform
location is uniformly distributed Uniform
room_id has unique values Unique
last_modified has unique values Unique
location has unique values Unique
reviews has 2973 (15.9%) zeros Zeros
overall_satisfaction has 5725 (30.7%) zeros Zeros
bedrooms has 1148 (6.1%) zeros Zeros
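
The zero shares quoted in the Zeros warnings can be re-derived from the counts and the number of observations. The following is a minimal sketch of that arithmetic; the `zero_share` helper is a hypothetical name used for illustration and is not part of pandas-profiling.

```python
# Counts copied from the Zeros warnings; N_OBS from "Number of observations".
N_OBS = 18671
ZERO_COUNTS = {"reviews": 2973, "overall_satisfaction": 5725, "bedrooms": 1148}


def zero_share(zeros: int, n_obs: int) -> float:
    """Return the percentage of observations equal to zero, rounded to 0.1."""
    return round(100.0 * zeros / n_obs, 1)


# Hypothetical helper applied to each flagged variable.
shares = {name: zero_share(count, N_OBS) for name, count in ZERO_COUNTS.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The computed shares agree with the percentages shown in the warnings.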

Reproduction

Analysis started: 2021-09-07 09:51:27.925443
Analysis finished: 2021-09-07 09:51:51.429514
Duration: 23.5 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

room_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 18671
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11210389.23
Minimum: 2818
Maximum: 20003728
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 2818
5-th percentile: 1018186
Q1: 6046700
median: 12296977
Q3: 16624424.5
95-th percentile: 19578981.5
Maximum: 20003728
Range: 20000910
Interquartile range (IQR): 10577724.5

Descriptive statistics

Standard deviation: 6087345.603
Coefficient of variation (CV): 0.5430092996
Kurtosis: -1.227842256
Mean: 11210389.23
Median Absolute Deviation (MAD): 5238856
Skewness: -0.2557031374
Sum: 2.093091773 × 10^11
Variance: 3.705577649 × 10^13
Monotonicity: Not monotonic
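
Several of the room_id figures are simple functions of the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A small sketch that recomputes them from the rounded values printed in this report (so agreement is only approximate for the last three):

```python
# Values copied from the room_id summary in this report.
n_obs = 18671
mean = 11210389.23
std = 6087345.603
minimum, maximum = 2818, 20003728
q1, q3 = 6046700, 16624424.5

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
total = mean * n_obs              # Sum

print(value_range, iqr)
print(f"CV={cv:.6f} variance={variance:.4e} sum={total:.4e}")
```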
Histogram with fixed size bins (bins=50)
Value      Count  Frequency (%)
19075070   1      < 0.1%
15901265   1      < 0.1%
11597089   1      < 0.1%
1333886    1      < 0.1%
13806208   1      < 0.1%
3521399    1      < 0.1%
11614850   1      < 0.1%
18264707   1      < 0.1%
4266630    1      < 0.1%
8281117    1      < 0.1%
Other values (18661)  18661  99.9%

Value      Count  Frequency (%)
2818       1      < 0.1%
3209       1      < 0.1%
20168      1      < 0.1%
25428      1      < 0.1%
25488      1      < 0.1%
27886      1      < 0.1%
28658      1      < 0.1%
28871      1      < 0.1%
29051      1      < 0.1%
29554      1      < 0.1%

Value      Count  Frequency (%)
20003728   1      < 0.1%
19996091   1      < 0.1%
19995673   1      < 0.1%
19995327   1      < 0.1%
19995246   1      < 0.1%
19995106   1      < 0.1%
19994262   1      < 0.1%
19992677   1      < 0.1%
19992596   1      < 0.1%
19992241   1      < 0.1%

survey_id
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
1476  18671

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters74684
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1476
2nd row1476
3rd row1476
4th row1476
5th row1476

Common Values

Value  Count  Frequency (%)
1476   18671  100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
147618671
100.0%

Most occurring characters

ValueCountFrequency (%)
118671
25.0%
418671
25.0%
718671
25.0%
618671
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number74684
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
118671
25.0%
418671
25.0%
718671
25.0%
618671
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common74684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
118671
25.0%
418671
25.0%
718671
25.0%
618671
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII74684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
118671
25.0%
418671
25.0%
718671
25.0%
618671
25.0%

host_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 15897
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35792660.12
Minimum: 2234
Maximum: 141831915
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 2234
5-th percentile: 1477248
Q1: 7126211.5
median: 19884429
Q3: 52033129
95-th percentile: 121991189
Maximum: 141831915
Range: 141829681
Interquartile range (IQR): 44906917.5

Descriptive statistics

Standard deviation: 37613303.54
Coefficient of variation (CV): 1.05086639
Kurtosis: 0.4850144549
Mean: 35792660.12
Median Absolute Deviation (MAD): 15789616
Skewness: 1.242943803
Sum: 6.682847572 × 10^11
Variance: 1.414760603 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value      Count  Frequency (%)
48703385   93     0.5%
113977564  88     0.5%
1464510    71     0.4%
107745142  64     0.3%
84453740   61     0.3%
65859990   54     0.3%
517215     52     0.3%
46691672   43     0.2%
84449589   37     0.2%
669178     36     0.2%
Other values (15887)  18072  96.8%

Value      Count  Frequency (%)
2234       1      < 0.1%
3159       1      < 0.1%
3806       1      < 0.1%
5988       2      < 0.1%
7924       1      < 0.1%
12085      1      < 0.1%
20405      1      < 0.1%
34080      1      < 0.1%
36701      1      < 0.1%
40786      1      < 0.1%

Value      Count  Frequency (%)
141831915  1      < 0.1%
141749109  1      < 0.1%
141747815  1      < 0.1%
141665148  4      < 0.1%
141658022  1      < 0.1%
141648682  1      < 0.1%
141551211  1      < 0.1%
141548705  1      < 0.1%
141542351  1      < 0.1%
141534602  1      < 0.1%

room_type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
Entire home/apt  14937
Private room     3671
Shared room      63

Length

Max length15
Median length15
Mean length14.39665792
Min length11

Characters and Unicode

Total characters268800
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
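
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count characters per Unicode general category for the three room_type values, weighted by their row counts from this report. The `category_counts` helper is a hypothetical name; the report's own implementation may differ. The two-letter codes map to the report's labels as follows: `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

# Value counts copied from the room_type summary in this report.
ROOM_TYPE_COUNTS = {"Entire home/apt": 14937, "Private room": 3671, "Shared room": 63}


def category_counts(counts: dict) -> Counter:
    """Count characters per Unicode general category, weighted by row count."""
    totals = Counter()
    for value, n_rows in counts.items():
        for char in value:
            totals[unicodedata.category(char)] += n_rows
    return totals


totals = category_counts(ROOM_TYPE_COUNTS)
total_chars = sum(totals.values())
print(dict(totals), total_chars)
```

The totals reproduce the figures reported for room_type: 268800 characters overall, split into four categories.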

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowShared room
2nd rowShared room
3rd rowShared room
4th rowShared room
5th rowShared room

Common Values

Value            Count  Frequency (%)
Entire home/apt  14937  80.0%
Private room     3671   19.7%
Shared room      63     0.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
home/apt14937
40.0%
entire14937
40.0%
room3734
 
10.0%
private3671
 
9.8%
shared63
 
0.2%

Most occurring characters

ValueCountFrequency (%)
e33608
12.5%
t33545
12.5%
r22405
8.3%
o22405
8.3%
a18671
 
6.9%
18671
 
6.9%
m18671
 
6.9%
i18608
 
6.9%
h15000
 
5.6%
E14937
 
5.6%
Other values (7)52279
19.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter216521
80.6%
Uppercase Letter18671
 
6.9%
Space Separator18671
 
6.9%
Other Punctuation14937
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e33608
15.5%
t33545
15.5%
r22405
10.3%
o22405
10.3%
a18671
8.6%
m18671
8.6%
i18608
8.6%
h15000
6.9%
n14937
6.9%
p14937
6.9%
Other values (2)3734
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
E14937
80.0%
P3671
 
19.7%
S63
 
0.3%
Space Separator
ValueCountFrequency (%)
18671
100.0%
Other Punctuation
ValueCountFrequency (%)
/14937
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin235192
87.5%
Common33608
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e33608
14.3%
t33545
14.3%
r22405
9.5%
o22405
9.5%
a18671
7.9%
m18671
7.9%
i18608
7.9%
h15000
6.4%
E14937
6.4%
n14937
6.4%
Other values (5)22405
9.5%
Common
ValueCountFrequency (%)
18671
55.6%
/14937
44.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII268800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e33608
12.5%
t33545
12.5%
r22405
8.3%
o22405
8.3%
a18671
 
6.9%
18671
 
6.9%
m18671
 
6.9%
i18608
 
6.9%
h15000
 
5.6%
E14937
 
5.6%
Other values (7)52279
19.4%

city
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
Amsterdam  18671

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters168039
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAmsterdam
2nd rowAmsterdam
3rd rowAmsterdam
4th rowAmsterdam
5th rowAmsterdam

Common Values

Value      Count  Frequency (%)
Amsterdam  18671  100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
amsterdam18671
100.0%

Most occurring characters

ValueCountFrequency (%)
m37342
22.2%
A18671
11.1%
s18671
11.1%
t18671
11.1%
e18671
11.1%
r18671
11.1%
d18671
11.1%
a18671
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter149368
88.9%
Uppercase Letter18671
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m37342
25.0%
s18671
12.5%
t18671
12.5%
e18671
12.5%
r18671
12.5%
d18671
12.5%
a18671
12.5%
Uppercase Letter
ValueCountFrequency (%)
A18671
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin168039
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m37342
22.2%
A18671
11.1%
s18671
11.1%
t18671
11.1%
e18671
11.1%
r18671
11.1%
d18671
11.1%
a18671
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII168039
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m37342
22.2%
A18671
11.1%
s18671
11.1%
t18671
11.1%
e18671
11.1%
r18671
11.1%
d18671
11.1%
a18671
11.1%

neighborhood
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 23
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
De Baarsjes / Oud West   3276
De Pijp / Rivierenbuurt  2371
Centrum West             2216
Centrum Oost             1727
Westerpark               1428
Other values (18)        7653

Length

Max length38
Median length15
Mean length17.5162016
Min length6

Characters and Unicode

Total characters327045
Distinct characters43
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDe Pijp / Rivierenbuurt
2nd rowCentrum West
3rd rowWatergraafsmeer
4th rowCentrum West
5th rowDe Baarsjes / Oud West

Common Values

Value                                   Count  Frequency (%)
De Baarsjes / Oud West                  3276   17.5%
De Pijp / Rivierenbuurt                 2371   12.7%
Centrum West                            2216   11.9%
Centrum Oost                            1727   9.2%
Westerpark                              1428   7.6%
Noord-West / Noord-Midden               1415   7.6%
Oud Oost                                1166   6.2%
Bos en Lommer                           983    5.3%
Oostelijk Havengebied / Indische Buurt  920    4.9%
Watergraafsmeer                         514    2.8%
Other values (13)                       2655   14.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
8960
15.9%
de5761
10.3%
west5732
10.2%
oud4936
 
8.8%
centrum4042
 
7.2%
baarsjes3276
 
5.8%
oost3211
 
5.7%
pijp2371
 
4.2%
rivierenbuurt2371
 
4.2%
westerpark1428
 
2.5%
Other values (27)14097
25.1%

Most occurring characters

ValueCountFrequency (%)
e41006
 
12.5%
37514
 
11.5%
r24808
 
7.6%
s22145
 
6.8%
t22088
 
6.8%
u17123
 
5.2%
d14710
 
4.5%
o14559
 
4.5%
i12517
 
3.8%
a11891
 
3.6%
Other values (33)108684
33.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter228669
69.9%
Uppercase Letter49072
 
15.0%
Space Separator37514
 
11.5%
Other Punctuation8960
 
2.7%
Dash Punctuation2830
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e41006
17.9%
r24808
10.8%
s22145
9.7%
t22088
9.7%
u17123
7.5%
d14710
 
6.4%
o14559
 
6.4%
i12517
 
5.5%
a11891
 
5.2%
n11629
 
5.1%
Other values (13)36193
15.8%
Uppercase Letter
ValueCountFrequency (%)
O9230
18.8%
W9104
18.6%
D5803
11.8%
B5625
11.5%
C4042
8.2%
N3899
7.9%
P2371
 
4.8%
R2371
 
4.8%
M1415
 
2.9%
I1297
 
2.6%
Other values (7)3915
8.0%
Space Separator
ValueCountFrequency (%)
37514
100.0%
Other Punctuation
ValueCountFrequency (%)
/8960
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2830
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin277741
84.9%
Common49304
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e41006
14.8%
r24808
 
8.9%
s22145
 
8.0%
t22088
 
8.0%
u17123
 
6.2%
d14710
 
5.3%
o14559
 
5.2%
i12517
 
4.5%
a11891
 
4.3%
n11629
 
4.2%
Other values (30)85265
30.7%
Common
ValueCountFrequency (%)
37514
76.1%
/8960
 
18.2%
-2830
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII327045
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e41006
 
12.5%
37514
 
11.5%
r24808
 
7.6%
s22145
 
6.8%
t22088
 
6.8%
u17123
 
5.2%
d14710
 
4.5%
o14559
 
4.5%
i12517
 
3.8%
a11891
 
3.6%
Other values (33)108684
33.2%

reviews
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 283
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.75721707
Minimum: 0
Maximum: 532
Zeros: 2973
Zeros (%): 15.9%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 6
Q3: 17
95-th percentile: 67
Maximum: 532
Range: 532
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 33.52959861
Coefficient of variation (CV): 2.000904951
Kurtosis: 43.77759527
Mean: 16.75721707
Median Absolute Deviation (MAD): 6
Skewness: 5.502837236
Sum: 312874
Variance: 1124.233983
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0      2973   15.9%
1      1504   8.1%
2      1240   6.6%
3      1099   5.9%
4      924    4.9%
5      874    4.7%
6      736    3.9%
7      682    3.7%
8      588    3.1%
9      527    2.8%
Other values (273)  7524  40.3%

Value  Count  Frequency (%)
0      2973   15.9%
1      1504   8.1%
2      1240   6.6%
3      1099   5.9%
4      924    4.9%
5      874    4.7%
6      736    3.9%
7      682    3.7%
8      588    3.1%
9      527    2.8%

Value  Count  Frequency (%)
532    1      < 0.1%
465    1      < 0.1%
463    1      < 0.1%
452    1      < 0.1%
447    1      < 0.1%
443    2      < 0.1%
433    1      < 0.1%
430    2      < 0.1%
425    1      < 0.1%
410    2      < 0.1%

overall_satisfaction
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.302956457
Minimum: 0
Maximum: 5
Zeros: 5725
Zeros (%): 30.7%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4.5
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.212851115
Coefficient of variation (CV): 0.669960729
Kurtosis: -1.314098148
Mean: 3.302956457
Median Absolute Deviation (MAD): 0.5
Skewness: -0.7945374407
Sum: 61669.5
Variance: 4.896710059
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count  Frequency (%)
5      7693   41.2%
0      5725   30.7%
4.5    4546   24.3%
4      576    3.1%
3.5    109    0.6%
3      19     0.1%
1      1      < 0.1%
2.5    1      < 0.1%
1.5    1      < 0.1%

Value  Count  Frequency (%)
0      5725   30.7%
1      1      < 0.1%
1.5    1      < 0.1%
2.5    1      < 0.1%
3      19     0.1%
3.5    109    0.6%
4      576    3.1%
4.5    4546   24.3%
5      7693   41.2%

Value  Count  Frequency (%)
5      7693   41.2%
4.5    4546   24.3%
4      576    3.1%
3.5    109    0.6%
3      19     0.1%
2.5    1      < 0.1%
1.5    1      < 0.1%
1      1      < 0.1%
0      5725   30.7%

accommodates
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 16
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.922875047
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
median: 2
Q3: 4
95-th percentile: 5
Maximum: 17
Range: 16
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.327895671
Coefficient of variation (CV): 0.4543114741
Kurtosis: 14.35766347
Mean: 2.922875047
Median Absolute Deviation (MAD): 0
Skewness: 2.390197061
Sum: 54573
Variance: 1.763306913
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Value  Count  Frequency (%)
2      9991   53.5%
4      5571   29.8%
3      1579   8.5%
6      473    2.5%
5      471    2.5%
1      365    2.0%
8      105    0.6%
7      52     0.3%
16     20     0.1%
10     16     0.1%
Other values (6)  28  0.1%

Value  Count  Frequency (%)
1      365    2.0%
2      9991   53.5%
3      1579   8.5%
4      5571   29.8%
5      471    2.5%
6      473    2.5%
7      52     0.3%
8      105    0.6%
9      8      < 0.1%
10     16     0.1%

Value  Count  Frequency (%)
17     1      < 0.1%
16     20     0.1%
14     6      < 0.1%
13     1      < 0.1%
12     10     0.1%
11     2      < 0.1%
10     16     0.1%
9      8      < 0.1%
8      105    0.6%
7      52     0.3%

bedrooms
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.430989235
Minimum: 0
Maximum: 10
Zeros: 1148
Zeros (%): 6.1%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 2
95-th percentile: 3
Maximum: 10
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8792321975
Coefficient of variation (CV): 0.6144226498
Kurtosis: 5.629498729
Mean: 1.430989235
Median Absolute Deviation (MAD): 0
Skewness: 1.602014005
Sum: 26718
Variance: 0.773049257
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value  Count  Frequency (%)
1      11068  59.3%
2      4446   23.8%
3      1442   7.7%
0      1148   6.1%
4      472    2.5%
5      62     0.3%
6      19     0.1%
10     5      < 0.1%
7      4      < 0.1%
8      3      < 0.1%

Value  Count  Frequency (%)
0      1148   6.1%
1      11068  59.3%
2      4446   23.8%
3      1442   7.7%
4      472    2.5%
5      62     0.3%
6      19     0.1%
7      4      < 0.1%
8      3      < 0.1%
9      2      < 0.1%

Value  Count  Frequency (%)
10     5      < 0.1%
9      2      < 0.1%
8      3      < 0.1%
7      4      < 0.1%
6      19     0.1%
5      62     0.3%
4      472    2.5%
3      1442   7.7%
2      4446   23.8%
1      11068  59.3%

price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 423
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 166.6377805
Minimum: 12
Maximum: 6000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 12
5-th percentile: 72
Q1: 108
median: 144
Q3: 192
95-th percentile: 330
Maximum: 6000
Range: 5988
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 108.9759641
Coefficient of variation (CV): 0.6539691286
Kurtosis: 522.6742749
Mean: 166.6377805
Median Absolute Deviation (MAD): 36
Skewness: 12.78844196
Sum: 3111294
Variance: 11875.76076
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
119    1016   5.4%
180    998    5.3%
144    884    4.7%
150    619    3.3%
132    588    3.1%
108    560    3.0%
96     517    2.8%
118    508    2.7%
114    507    2.7%
240    492    2.6%
Other values (413)  11982  64.2%

Value  Count  Frequency (%)
12     1      < 0.1%
18     1      < 0.1%
21     1      < 0.1%
22     1      < 0.1%
23     1      < 0.1%
24     6      < 0.1%
25     1      < 0.1%
28     1      < 0.1%
29     2      < 0.1%
30     6      < 0.1%

Value  Count  Frequency (%)
6000   1      < 0.1%
3770   1      < 0.1%
1920   1      < 0.1%
1799   1      < 0.1%
1558   1      < 0.1%
1428   1      < 0.1%
1412   1      < 0.1%
1386   1      < 0.1%
1343   1      < 0.1%
1319   1      < 0.1%

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 18150
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
Amsterdam                          36
Lovely apartment near Vondelpark   10
Spacious family house with garden  8
Beautiful apartment in Amsterdam   8
Magnificent panoramic city view    8
Other values (18145)               18601

Length

Max length78
Median length35
Mean length36.09233571
Min length1

Characters and Unicode

Total characters673880
Distinct characters157
Distinct categories20 ?
Distinct scripts4 ?
Distinct blocks10 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique17814 ?
Unique (%)95.4%

Sample

1st rowRed Light/ Canal view apartment (Shared)
2nd rowSunny and Cozy Living room in quite neighbours
3rd rowAmsterdam
4th rowCanal boat RIDE in Amsterdam
5th rowOne room for rent in a three room appartment

Common Values

Value                               Count  Frequency (%)
Amsterdam                           36     0.2%
Lovely apartment near Vondelpark    10     0.1%
Spacious family house with garden   8      < 0.1%
Beautiful apartment in Amsterdam    8      < 0.1%
Magnificent panoramic city view     8      < 0.1%
Cosy apartment in Amsterdam         8      < 0.1%
Lovely apartment in Amsterdam       7      < 0.1%
Nice comfy room, magnificent view   7      < 0.1%
Spacious apartment near Vondelpark  7      < 0.1%
Cosy apartment near Vondelpark      6      < 0.1%
Other values (18140)                18566  99.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
apartment7118
 
6.7%
in5730
 
5.4%
amsterdam3588
 
3.4%
3195
 
3.0%
with2669
 
2.5%
the2165
 
2.0%
spacious2082
 
2.0%
city1906
 
1.8%
centre1768
 
1.7%
room1728
 
1.6%
Other values (4867)73723
69.8%
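
The table above lists lower-cased word tokens from the name column. A rough way to produce a table of this kind is sketched below on the five sample rows shown for this variable. It assumes tokens are obtained by lower-casing each value and splitting on whitespace; that tokenisation rule is an assumption and may not match the profiler exactly, and `word_counts` is a hypothetical helper name.

```python
from collections import Counter

# The five sample rows listed for the name variable in this report.
SAMPLE_NAMES = [
    "Red Light/ Canal view apartment (Shared)",
    "Sunny and Cozy Living room in quite neighbours",
    "Amsterdam",
    "Canal boat RIDE in Amsterdam",
    "One room for rent in a three room appartment",
]


def word_counts(values):
    """Count lower-cased, whitespace-separated tokens (assumed tokenisation)."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


counts = word_counts(SAMPLE_NAMES)
for word, count in counts.most_common(4):
    print(word, count)
```

With this rule punctuation stays attached to tokens (for example `light/` and `(shared)`), so a stricter tokeniser would give slightly different counts.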

Most occurring characters

ValueCountFrequency (%)
87491
 
13.0%
e59230
 
8.8%
t55217
 
8.2%
a52626
 
7.8%
r42831
 
6.4%
n39759
 
5.9%
o35472
 
5.3%
i32482
 
4.8%
m26379
 
3.9%
s21195
 
3.1%
Other values (147)221198
32.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter510398
75.7%
Space Separator87492
 
13.0%
Uppercase Letter54936
 
8.2%
Other Punctuation11184
 
1.7%
Decimal Number5572
 
0.8%
Dash Punctuation1595
 
0.2%
Math Symbol1136
 
0.2%
Close Punctuation621
 
0.1%
Open Punctuation588
 
0.1%
Other Symbol236
 
< 0.1%
Other values (10)122
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (22)22
56.4%
Lowercase Letter
ValueCountFrequency (%)
e59230
11.6%
t55217
10.8%
a52626
10.3%
r42831
 
8.4%
n39759
 
7.8%
o35472
 
6.9%
i32482
 
6.4%
m26379
 
5.2%
s21195
 
4.2%
p19825
 
3.9%
Other values (20)125382
24.6%
Uppercase Letter
ValueCountFrequency (%)
A8892
16.2%
C6863
12.5%
S4399
 
8.0%
L3283
 
6.0%
B3251
 
5.9%
R2791
 
5.1%
P2694
 
4.9%
E2341
 
4.3%
T2219
 
4.0%
N2194
 
4.0%
Other values (17)16009
29.1%
Other Punctuation
ValueCountFrequency (%)
,2817
25.2%
!2756
24.6%
&1686
15.1%
.1473
13.2%
'831
 
7.4%
/587
 
5.2%
@315
 
2.8%
"285
 
2.5%
:189
 
1.7%
*154
 
1.4%
Other values (7)91
 
0.8%
Decimal Number
ValueCountFrequency (%)
21885
33.8%
1992
17.8%
0741
 
13.3%
5498
 
8.9%
3463
 
8.3%
4412
 
7.4%
8150
 
2.7%
6150
 
2.7%
9145
 
2.6%
7136
 
2.4%
Other Symbol
ValueCountFrequency (%)
171
72.5%
33
 
14.0%
14
 
5.9%
5
 
2.1%
5
 
2.1%
3
 
1.3%
°3
 
1.3%
1
 
0.4%
1
 
0.4%
Math Symbol
ValueCountFrequency (%)
+660
58.1%
|460
40.5%
<5
 
0.4%
>4
 
0.4%
=3
 
0.3%
~2
 
0.2%
1
 
0.1%
÷1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
(581
98.8%
[6
 
1.0%
1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
)614
98.9%
]6
 
1.0%
1
 
0.2%
Space Separator
ValueCountFrequency (%)
87491
> 99.9%
 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-1593
99.9%
2
 
0.1%
Nonspacing Mark
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Control
ValueCountFrequency (%)
6
50.0%
6
50.0%
Final Punctuation
ValueCountFrequency (%)
9
81.8%
2
 
18.2%
Initial Punctuation
ValueCountFrequency (%)
3
60.0%
2
40.0%
Currency Symbol
ValueCountFrequency (%)
4
80.0%
$1
 
20.0%
Other Number
ValueCountFrequency (%)
²22
100.0%
Connector Punctuation
ValueCountFrequency (%)
_7
100.0%
Modifier Symbol
ValueCountFrequency (%)
´4
100.0%
Format
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin565334
83.9%
Common108491
 
16.1%
Han39
 
< 0.1%
Inherited16
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
87491
80.6%
,2817
 
2.6%
!2756
 
2.5%
21885
 
1.7%
&1686
 
1.6%
-1593
 
1.5%
.1473
 
1.4%
1992
 
0.9%
'831
 
0.8%
0741
 
0.7%
Other values (56)6226
 
5.7%
Latin
ValueCountFrequency (%)
e59230
 
10.5%
t55217
 
9.8%
a52626
 
9.3%
r42831
 
7.6%
n39759
 
7.0%
o35472
 
6.3%
i32482
 
5.7%
m26379
 
4.7%
s21195
 
3.7%
p19825
 
3.5%
Other values (47)180318
31.9%
Han
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (22)22
56.4%
Inherited
ValueCountFrequency (%)
15
93.8%
1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII673492
99.9%
Misc Symbols216
 
< 0.1%
Latin 1 Sup57
 
< 0.1%
CJK39
 
< 0.1%
Punctuation34
 
< 0.1%
VS16
 
< 0.1%
Dingbats14
 
< 0.1%
None7
 
< 0.1%
Currency Symbols4
 
< 0.1%
Math Operators1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
87491
 
13.0%
e59230
 
8.8%
t55217
 
8.2%
a52626
 
7.8%
r42831
 
6.4%
n39759
 
5.9%
o35472
 
5.3%
i32482
 
4.8%
m26379
 
3.9%
s21195
 
3.1%
Other values (82)220810
32.8%
Latin 1 Sup
ValueCountFrequency (%)
²22
38.6%
é15
26.3%
à4
 
7.0%
´4
 
7.0%
°3
 
5.3%
É3
 
5.3%
á2
 
3.5%
¡1
 
1.8%
 1
 
1.8%
÷1
 
1.8%
Misc Symbols
ValueCountFrequency (%)
171
79.2%
33
 
15.3%
5
 
2.3%
5
 
2.3%
1
 
0.5%
1
 
0.5%
None
ValueCountFrequency (%)
3
42.9%
2
28.6%
1
 
14.3%
1
 
14.3%
VS
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Dingbats
ValueCountFrequency (%)
14
100.0%
Punctuation
ValueCountFrequency (%)
15
44.1%
9
26.5%
3
 
8.8%
2
 
5.9%
2
 
5.9%
2
 
5.9%
1
 
2.9%
Math Operators
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
4
100.0%
CJK
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (22)22
56.4%

last_modified
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 18671
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
2017-07-22 23:13:49.922770  1
2017-07-22 16:09:30.267337  1
2017-07-23 13:06:01.692221  1
2017-07-23 06:03:12.052431  1
2017-07-23 05:53:53.038499  1
Other values (18666)        18666

Length

Max length26
Median length26
Mean length26
Min length26

Characters and Unicode

Total characters485446
Distinct characters14
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18671 ?
Unique (%)100.0%

Sample

1st row2017-07-23 13:06:27.391699
2nd row2017-07-23 13:06:23.607187
3rd row2017-07-23 13:06:23.603546
4th row2017-07-23 13:06:22.689787
5th row2017-07-23 13:06:19.681469
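
The report treats last_modified as a categorical string, but the sample values follow an ISO-style timestamp layout with microseconds. A minimal sketch, using only the five sample rows above, of parsing them with the standard library so they can be handled as datetimes instead:

```python
from datetime import datetime

# The five sample rows listed for last_modified in this report.
SAMPLE = [
    "2017-07-23 13:06:27.391699",
    "2017-07-23 13:06:23.607187",
    "2017-07-23 13:06:23.603546",
    "2017-07-23 13:06:22.689787",
    "2017-07-23 13:06:19.681469",
]

# datetime.fromisoformat accepts "YYYY-MM-DD HH:MM:SS.ffffff" (Python 3.7+).
parsed = [datetime.fromisoformat(value) for value in SAMPLE]
span = max(parsed) - min(parsed)
print(min(parsed), max(parsed), span.total_seconds())
```

Parsed this way, the column can be summarised as a date range and not as a set of unique strings.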

Common Values

Value                       Count  Frequency (%)
2017-07-22 23:13:49.922770  1      < 0.1%
2017-07-22 16:09:30.267337  1      < 0.1%
2017-07-23 13:06:01.692221  1      < 0.1%
2017-07-23 06:03:12.052431  1      < 0.1%
2017-07-23 05:53:53.038499  1      < 0.1%
2017-07-22 22:48:25.121502  1      < 0.1%
2017-07-23 05:56:43.570859  1      < 0.1%
2017-07-22 17:36:18.383894  1      < 0.1%
2017-07-23 03:12:42.530410  1      < 0.1%
2017-07-23 03:30:30.634173  1      < 0.1%
Other values (18661)        18661  99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2017-07-2213653
36.6%
2017-07-235018
 
13.4%
22:44:06.2547581
 
< 0.1%
16:33:22.3547141
 
< 0.1%
16:07:11.9178131
 
< 0.1%
18:27:03.7673691
 
< 0.1%
20:01:57.3741921
 
< 0.1%
16:05:46.7251421
 
< 0.1%
18:04:31.2373611
 
< 0.1%
20:28:39.8501831
 
< 0.1%
Other values (18663)18663
50.0%

Most occurring characters

ValueCountFrequency (%)
281579
16.8%
065245
13.4%
755414
11.4%
148635
10.0%
-37342
7.7%
:37342
7.7%
329560
 
6.1%
522440
 
4.6%
419585
 
4.0%
619230
 
4.0%
Other values (4)69074
14.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number373420
76.9%
Other Punctuation56013
 
11.5%
Dash Punctuation37342
 
7.7%
Space Separator18671
 
3.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
281579
21.8%
065245
17.5%
755414
14.8%
148635
13.0%
329560
 
7.9%
522440
 
6.0%
419585
 
5.2%
619230
 
5.1%
816209
 
4.3%
915523
 
4.2%
Other Punctuation
ValueCountFrequency (%)
:37342
66.7%
.18671
33.3%
Dash Punctuation
ValueCountFrequency (%)
-37342
100.0%
Space Separator
ValueCountFrequency (%)
18671
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common485446
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
281579
16.8%
065245
13.4%
755414
11.4%
148635
10.0%
-37342
7.7%
:37342
7.7%
329560
 
6.1%
522440
 
4.6%
419585
 
4.0%
619230
 
4.0%
Other values (4)69074
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII485446
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
281579
16.8%
065245
13.4%
755414
11.4%
148635
10.0%
-37342
7.7%
:37342
7.7%
329560
 
6.1%
522440
 
4.6%
419585
 
4.0%
619230
 
4.0%
Other values (4)69074
14.2%

latitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15560
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.36525919
Minimum: 52.2962
Maximum: 52.42498
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 52.2962
5-th percentile: 52.343284
Q1: 52.355253
median: 52.364623
Q3: 52.3747995
95-th percentile: 52.3893735
Maximum: 52.42498
Range: 0.12878
Interquartile range (IQR): 0.0195465

Descriptive statistics

Standard deviation: 0.01515023385
Coefficient of variation (CV): 0.0002893184162
Kurtosis: 1.417392356
Mean: 52.36525919
Median Absolute Deviation (MAD): 0.009737
Skewness: 0.007665258349
Sum: 977711.7543
Variance: 0.0002295295858
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value      Count  Frequency (%)
52.354646  5      < 0.1%
52.366852  5      < 0.1%
52.361364  5      < 0.1%
52.362453  4      < 0.1%
52.379011  4      < 0.1%
52.354748  4      < 0.1%
52.37004   4      < 0.1%
52.363217  4      < 0.1%
52.369917  4      < 0.1%
52.355191  4      < 0.1%
Other values (15550)  18628  99.8%

Value      Count  Frequency (%)
52.2962    1      < 0.1%
52.297203  1      < 0.1%
52.299763  1      < 0.1%
52.299846  1      < 0.1%
52.299875  1      < 0.1%
52.300105  1      < 0.1%
52.30013   1      < 0.1%
52.300915  1      < 0.1%
52.301257  1      < 0.1%
52.301683  1      < 0.1%

Value      Count  Frequency (%)
52.42498   1      < 0.1%
52.424641  1      < 0.1%
52.424255  1      < 0.1%
52.423647  1      < 0.1%
52.423498  1      < 0.1%
52.423432  1      < 0.1%
52.423321  1      < 0.1%
52.422827  1      < 0.1%
52.422232  1      < 0.1%
52.422228  1      < 0.1%

longitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17112
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.888602383
Minimum: 4.763264
Maximum: 5.027689
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 145.9 KiB

Quantile statistics

Minimum: 4.763264
5-th percentile: 4.845295
Q1: 4.864383
median: 4.886012
Q3: 4.907499
95-th percentile: 4.9445575
Maximum: 5.027689
Range: 0.264425
Interquartile range (IQR): 0.043116

Descriptive statistics

Standard deviation: 0.03455214945
Coefficient of variation (CV): 0.007067899318
Kurtosis: 1.217272119
Mean: 4.888602383
Median Absolute Deviation (MAD): 0.021551
Skewness: 0.5376597501
Sum: 91275.0951
Variance: 0.001193851032
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value     Count  Frequency (%)
4.907187  5      < 0.1%
4.86301   4      < 0.1%
4.888738  4      < 0.1%
4.861512  4      < 0.1%
4.877004  4      < 0.1%
4.904646  4      < 0.1%
4.856525  4      < 0.1%
4.893506  4      < 0.1%
4.893017  4      < 0.1%
4.891267  4      < 0.1%
Other values (17102)  18630  99.8%

Value     Count  Frequency (%)
4.763264  1      < 0.1%
4.768452  1      < 0.1%
4.769151  1      < 0.1%
4.771083  1      < 0.1%
4.772725  1      < 0.1%
4.772822  1      < 0.1%
4.775168  1      < 0.1%
4.775748  1      < 0.1%
4.77647   1      < 0.1%
4.77764   1      < 0.1%

Value     Count  Frequency (%)
5.027689  1      < 0.1%
5.026701  1      < 0.1%
5.015737  1      < 0.1%
5.013557  1      < 0.1%
5.013316  1      < 0.1%
5.013075  1      < 0.1%
5.012549  1      < 0.1%
5.011693  1      < 0.1%
5.011688  1      < 0.1%
5.011569  1      < 0.1%

location
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 18671
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 73.0 KiB
0101000020E6100000B81FF0C00072134050C763062A2D4A40 | 1
0101000020E6100000DC662AC423911340438D4292592D4A40 | 1
0101000020E6100000E8A4F78DAF9D134083DA6FED442D4A40 | 1
0101000020E61000001F2DCE18E66413400454388254304A40 | 1
0101000020E610000048C0E8F2E6601340E8C072840C304A40 | 1
Other values (18666) | 18666

Length

Max length: 50
Median length: 50
Mean length: 50
Min length: 50

Characters and Unicode

Total characters: 933550
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18671
Unique (%): 100.0%

Sample

1st row: 0101000020E610000033FAD170CA8C13403BC5AA41982D4A40
2nd row: 0101000020E6100000842A357BA095134042791F4773304A40
3rd row: 0101000020E6100000A51133FB3CC613403543AA285E2B4A40
4th row: 0101000020E6100000DF180280638F134085EE92382B304A40
5th row: 0101000020E6100000CD902A8A57691340187B2FBE682F4A40
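
The report treats location as an opaque 50-character string. The values appear to be hex-encoded EWKB points as produced by PostGIS: a byte-order flag, a geometry type with an SRID flag, the SRID, then two 8-byte doubles. That reading is an assumption, not something the report states. The sketch below decodes one value under that assumption; the function name `decode_ewkb_point` is hypothetical.

```python
import struct

SRID_FLAG = 0x20000000  # EWKB flag bit: an SRID follows the geometry type


def decode_ewkb_point(hex_str):
    """Decode a hex EWKB point into (srid, x, y). Assumes a 2D point."""
    raw = bytes.fromhex(hex_str)
    order = "<" if raw[0] == 1 else ">"   # 1 = little-endian, 0 = big-endian
    (geom_type,) = struct.unpack(order + "I", raw[1:5])
    if geom_type & 0xFF != 1:
        raise ValueError("not a point geometry")
    offset = 5
    srid = None
    if geom_type & SRID_FLAG:
        (srid,) = struct.unpack(order + "I", raw[offset:offset + 4])
        offset += 4
    x, y = struct.unpack(order + "dd", raw[offset:offset + 16])
    return srid, x, y


# First sample row of the location column.
srid, lon, lat = decode_ewkb_point(
    "0101000020E610000033FAD170CA8C13403BC5AA41982D4A40"
)
print(srid, round(lon, 6), round(lat, 6))
```

If the assumption holds, the decoded x and y match the longitude and latitude columns of the same row, which would make location redundant with those two columns.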

Common Values

Value | Count | Frequency (%)
0101000020E6100000B81FF0C00072134050C763062A2D4A40 | 1 | < 0.1%
0101000020E6100000DC662AC423911340438D4292592D4A40 | 1 | < 0.1%
0101000020E6100000E8A4F78DAF9D134083DA6FED442D4A40 | 1 | < 0.1%
0101000020E61000001F2DCE18E66413400454388254304A40 | 1 | < 0.1%
0101000020E610000048C0E8F2E6601340E8C072840C304A40 | 1 | < 0.1%
0101000020E610000073840CE4D9951340A56ABB09BE2F4A40 | 1 | < 0.1%
0101000020E61000007978CF81E568134016325706D5304A40 | 1 | < 0.1%
0101000020E61000002766BD18CA791340FE65F7E4612F4A40 | 1 | < 0.1%
0101000020E6100000F8713447567E1340F4C0C760C52F4A40 | 1 | < 0.1%
0101000020E6100000E2AFC91AF5801340B0E8D66B7A2E4A40 | 1 | < 0.1%
Other values (18661) | 18661 | 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
0101000020e61000007f6c921ff1ab13404224438ead2d4a40 | 1 | < 0.1%
0101000020e6100000b6476fb88f6c13409487855ad3304a40 | 1 | < 0.1%
0101000020e6100000529b38b9dff11340a089b0e1e9254a40 | 1 | < 0.1%
0101000020e610000041d653abafbe1340d00d4dd9e9314a40 | 1 | < 0.1%
0101000020e6100000469a780778921340bda8ddaf022e4a40 | 1 | < 0.1%
0101000020e61000005feb5223f4c313409badbce47f2e4a40 | 1 | < 0.1%
0101000020e61000009da1b8e34d6e1340f1f09e03cb2d4a40 | 1 | < 0.1%
0101000020e610000029cb10c7ba681340fd2e6ccd562e4a40 | 1 | < 0.1%
0101000020e610000087c0914083ad13406bf12900c62d4a40 | 1 | < 0.1%
0101000020e610000076711b0de06d1340b6813b50a72e4a40 | 1 | < 0.1%
Other values (18661) | 18661 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 289022 | 31.0%
1 | 100127 | 10.7%
4 | 81005 | 8.7%
2 | 57879 | 6.2%
3 | 47959 | 5.1%
E | 47395 | 5.1%
6 | 46025 | 4.9%
A | 44985 | 4.8%
D | 28459 | 3.0%
8 | 28124 | 3.0%
Other values (6) | 162570 | 17.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 730873 | 78.3%
Uppercase Letter | 202677 | 21.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 289022 | 39.5%
1 | 100127 | 13.7%
4 | 81005 | 11.1%
2 | 57879 | 7.9%
3 | 47959 | 6.6%
6 | 46025 | 6.3%
8 | 28124 | 3.8%
7 | 28002 | 3.8%
9 | 27802 | 3.8%
5 | 24928 | 3.4%
Uppercase Letter
Value | Count | Frequency (%)
E | 47395 | 23.4%
A | 44985 | 22.2%
D | 28459 | 14.0%
F | 28051 | 13.8%
C | 27338 | 13.5%
B | 26449 | 13.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 730873 | 78.3%
Latin | 202677 | 21.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 289022 | 39.5%
1 | 100127 | 13.7%
4 | 81005 | 11.1%
2 | 57879 | 7.9%
3 | 47959 | 6.6%
6 | 46025 | 6.3%
8 | 28124 | 3.8%
7 | 28002 | 3.8%
9 | 27802 | 3.8%
5 | 24928 | 3.4%
Latin
Value | Count | Frequency (%)
E | 47395 | 23.4%
A | 44985 | 22.2%
D | 28459 | 14.0%
F | 28051 | 13.8%
C | 27338 | 13.5%
B | 26449 | 13.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 933550 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 289022 | 31.0%
1 | 100127 | 10.7%
4 | 81005 | 8.7%
2 | 57879 | 6.2%
3 | 47959 | 5.1%
E | 47395 | 5.1%
6 | 46025 | 4.9%
A | 44985 | 4.8%
D | 28459 | 3.0%
8 | 28124 | 3.0%
Other values (6) | 162570 | 17.4%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate shifts in location and positive rescaling of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
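
A minimal sketch of that calculation in plain Python follows. The function name `pearson_r` and the toy inputs are illustrative assumptions; they are not taken from the dataset or from the tool that generated this report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up values: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(x, y))
```

Shifting or positively rescaling either input leaves the result unchanged, which is the invariance described above.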

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
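
The same recipe can be sketched as: rank each variable, then apply the covariance-over-standard-deviations formula to the ranks. The helper names below are hypothetical, ties receive average ranks (a common convention, assumed here), and the inputs are made up.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Correlation of the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up values: y = x**3 is nonlinear but strictly increasing.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))
```

Because only the ordering matters, a strictly increasing nonlinear relationship still yields the maximum value.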

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
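
The pair-counting definition above (often called tau-a) can be sketched directly. The function name `kendall_tau_a` and the inputs are illustrative assumptions. Library implementations commonly report a tie-corrected variant (tau-b), so their results can differ from this sketch when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count toward neither total.
    return (concordant - discordant) / len(pairs)


# Made-up values: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.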

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
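
A minimal sketch of a bias-corrected Cramér's V in the form given by Bergsma (2013) follows, written in plain Python. It is not the report generator's own code. The function name `cramers_v_corrected` and the toy inputs are assumptions, and the chi-squared statistic is computed without a continuity correction.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(phi2_corr / denom)


# Made-up labels: the second variable is fully determined by the first.
a = ["x"] * 50 + ["y"] * 50
b = ["p"] * 50 + ["q"] * 50
print(cramers_v_corrected(a, b))
```

The guard for a non-positive denominator matters for columns such as survey_id and city, which this report flags as constant.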

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

index | room_id | survey_id | host_id | room_type | city | neighborhood | reviews | overall_satisfaction | accommodates | bedrooms | price | name | last_modified | latitude | longitude | location
0 | 10176931 | 1476 | 49180562 | Shared room | Amsterdam | De Pijp / Rivierenbuurt | 7 | 4.5 | 2 | 1.0 | 156.0 | Red Light/ Canal view apartment (Shared) | 2017-07-23 13:06:27.391699 | 52.356209 | 4.887491 | 0101000020E610000033FAD170CA8C13403BC5AA41982D4A40
1 | 8935871 | 1476 | 46718394 | Shared room | Amsterdam | Centrum West | 45 | 4.5 | 4 | 1.0 | 126.0 | Sunny and Cozy Living room in quite neighbours | 2017-07-23 13:06:23.607187 | 52.378518 | 4.896120 | 0101000020E6100000842A357BA095134042791F4773304A40
2 | 14011697 | 1476 | 10346595 | Shared room | Amsterdam | Watergraafsmeer | 1 | 0.0 | 3 | 1.0 | 132.0 | Amsterdam | 2017-07-23 13:06:23.603546 | 52.338811 | 4.943592 | 0101000020E6100000A51133FB3CC613403543AA285E2B4A40
3 | 6137978 | 1476 | 8685430 | Shared room | Amsterdam | Centrum West | 7 | 5.0 | 4 | 1.0 | 121.0 | Canal boat RIDE in Amsterdam | 2017-07-23 13:06:22.689787 | 52.376319 | 4.890028 | 0101000020E6100000DF180280638F134085EE92382B304A40
4 | 18630616 | 1476 | 70191803 | Shared room | Amsterdam | De Baarsjes / Oud West | 1 | 0.0 | 2 | 1.0 | 93.0 | One room for rent in a three room appartment | 2017-07-23 13:06:19.681469 | 52.370384 | 4.852873 | 0101000020E6100000CD902A8A57691340187B2FBE682F4A40
5 | 5790170 | 1476 | 29968916 | Shared room | Amsterdam | De Pijp / Rivierenbuurt | 184 | 4.5 | 2 | 1.0 | 102.0 | Beautiful apartment | 2017-07-23 13:06:19.663975 | 52.342265 | 4.897126 | 0101000020E6100000B090B932A896134060C8EA56CF2B4A40
6 | 934060 | 1476 | 5037506 | Shared room | Amsterdam | Oostelijk Havengebied / Indische Buurt | 67 | 5.0 | 16 | 1.0 | 462.0 | LOTUS, Classic Dutch Saling Barge | 2017-07-23 13:06:09.988016 | 52.377552 | 4.930418 | 0101000020E61000005D70067FBFB813400B45BA9F53304A40
7 | 19590049 | 1476 | 132687356 | Shared room | Amsterdam | Westerpark | 2 | 0.0 | 2 | 1.0 | 414.0 | big boot Adam 04 | 2017-07-23 13:06:09.984748 | 52.375205 | 4.866117 | 0101000020E6100000DD09F65FE7761340D925AAB706304A40
8 | 5020280 | 1476 | 4059485 | Shared room | Amsterdam | Oud Oost | 2 | 0.0 | 2 | 1.0 | 222.0 | Bright modern appartment in East! | 2017-07-23 13:06:07.452609 | 52.357346 | 4.912887 | 0101000020E610000032C687D9CBA613409FAD8383BD2D4A40
9 | 15810783 | 1476 | 84978218 | Shared room | Amsterdam | Centrum West | 0 | 0.0 | 12 | 1.0 | 301.0 | CANAL BOATTOUR AMSTERDAM covered boat 1,5 hour | 2017-07-23 13:06:07.447989 | 52.386610 | 4.890128 | 0101000020E6100000FB03E5B67D8F13403D27BD6F7C314A40

Last rows

index | room_id | survey_id | host_id | room_type | city | neighborhood | reviews | overall_satisfaction | accommodates | bedrooms | price | name | last_modified | latitude | longitude | location
18661 | 2763386 | 1476 | 14122005 | Private room | Amsterdam | Slotervaart | 118 | 5.0 | 2 | 1.0 | 36.0 | Comfortable SKY ROOM 12th floor | 2017-07-22 16:05:14.173175 | 52.361043 | 4.846134 | 0101000020E6100000792288F37062134091B932A8362E4A40
18662 | 19203256 | 1476 | 132265798 | Private room | Amsterdam | Bijlmer Centrum | 1 | 0.0 | 4 | 1.0 | 35.0 | NEW Stylish room, Ziggodome, AFAS LIVE, ArenA, RAI | 2017-07-22 16:05:14.168799 | 52.320049 | 4.955609 | 0101000020E6100000950D6B2A8BD213400A0F9A5DF7284A40
18663 | 19734178 | 1476 | 139135665 | Private room | Amsterdam | Osdorp | 0 | 0.0 | 1 | 0.0 | 30.0 | Cozy Apartment in Nieuw-West | 2017-07-22 16:05:14.166410 | 52.356702 | 4.792346 | 0101000020E61000003677F4BF5C2B13407A354069A82D4A40
18664 | 288967 | 1476 | 1501422 | Private room | Amsterdam | De Baarsjes / Oud West | 281 | 5.0 | 3 | 1.0 | 36.0 | BandB de Baarsjes Amsterdam | 2017-07-22 16:05:14.163973 | 52.361918 | 4.855507 | 0101000020E61000000DFFE9060A6C1340B8EA3A54532E4A40
18665 | 16685383 | 1476 | 5831960 | Private room | Amsterdam | Bos en Lommer | 5 | 5.0 | 2 | 1.0 | 30.0 | A nice bed in the attic of my 'palace'. | 2017-07-22 16:05:14.161714 | 52.379638 | 4.848829 | 0101000020E6100000E695EB6D33651340D0285DFA97304A40
18666 | 17789893 | 1476 | 47501089 | Private room | Amsterdam | Bijlmer Centrum | 10 | 5.0 | 3 | 1.0 | 32.0 | 1-3 pers. Cozy Rm AFAS Live, ArenA, ZIGGODOME | 2017-07-22 16:05:14.158963 | 52.319794 | 4.955638 | 0101000020E6100000684293C492D2134080BA8102EF284A40
18667 | 16877166 | 1476 | 67093870 | Private room | Amsterdam | Bijlmer Centrum | 6 | 5.0 | 4 | 1.0 | 24.0 | Modern Room by Arena, ZIGGO, HmH | 2017-07-22 16:05:14.151986 | 52.319080 | 4.954822 | 0101000020E61000005801BEDBBCD1134062670A9DD7284A40
18668 | 19859427 | 1476 | 29724632 | Private room | Amsterdam | Geuzenveld / Slotermeer | 0 | 0.0 | 1 | 1.0 | 38.0 | Private single room | 2017-07-22 16:05:14.149610 | 52.384028 | 4.838403 | 0101000020E61000002079E750865A1340C85F5AD427314A40
18669 | 17132164 | 1476 | 115156569 | Private room | Amsterdam | Centrum West | 13 | 4.5 | 2 | 1.0 | 36.0 | City Center studio in Touristic Amsterdam 1 | 2017-07-22 16:05:14.146183 | 52.372120 | 4.890982 | 0101000020E6100000774CDD955D9013400118CFA0A12F4A40
18670 | 7605782 | 1476 | 39503013 | Private room | Amsterdam | Centrum West | 113 | 4.5 | 2 | 1.0 | 35.0 | I have a room available for rent | 2017-07-22 16:05:12.257054 | 52.381392 | 4.899658 | 0101000020E6100000CD565EF23F9913405F7AFB73D1304A40